How much of the variation in the outcome variable is explained by the predictor variables?
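library(tidyverse)  # provides %>%, select(), drop_na(), rename(), mutate()
library(readxl)     # provides read_excel()

# Read the milk data, keep complete cases, and log-transform body mass
M <- read_excel("../data/Milk.xlsx", na = "NA") %>%
  select(species, kcal.per.g, mass, neocortex.perc) %>%
  drop_na() %>%
  rename(Species = species,
         Milk_Energy = kcal.per.g,
         Mass = mass,
         Neocortex = neocortex.perc) %>%
  mutate(log_Mass = log(Mass))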
fm_Multi <- lm(Milk_Energy ~ Neocortex + log_Mass, data = M)
summary(fm_Multi)
##
## Call:
## lm(formula = Milk_Energy ~ Neocortex + log_Mass, data = M)
##
## Residuals:
##       Min        1Q    Median        3Q       Max
## -0.250574 -0.039212  0.000633  0.072997  0.201985
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1.085254   0.515281  -2.106  0.05372 .
## Neocortex    0.027931   0.008015   3.485  0.00364 **
## log_Mass    -0.096402   0.024749  -3.895  0.00162 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1265 on 14 degrees of freedom
## Multiple R-squared:  0.5317, Adjusted R-squared:  0.4648
## F-statistic: 7.948 on 2 and 14 DF,  p-value: 0.004939
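\(R^2\) is the squared correlation between the predicted and observed values of the outcome:
y_hat <- predict(fm_Multi)  # model-predicted milk energy for each species
cor(y_hat, M$Milk_Energy)^2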
## [1] 0.5317037
summary(fm_Multi)$r.squared
## [1] 0.5317037
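The same equivalence can be checked on any fitted linear model. The following is a minimal sketch that assumes R's built-in `mtcars` data as a stand-in for the milk data (the variables `mpg`, `wt`, and `hp` are not part of the original analysis). It computes \(R^2\) three ways: as a squared correlation, as one minus the ratio of residual to total sum of squares, and as the value stored by `summary()`.

```r
# Stand-in example: built-in mtcars data, not the milk data used above
fm <- lm(mpg ~ wt + hp, data = mtcars)

# 1. Squared correlation between fitted and observed values
y_hat <- fitted(fm)
r2_cor <- cor(y_hat, mtcars$mpg)^2

# 2. One minus residual sum of squares over total sum of squares
ss_resid <- sum(residuals(fm)^2)
ss_total <- sum((mtcars$mpg - mean(mtcars$mpg))^2)
r2_ss <- 1 - ss_resid / ss_total

# 3. Value reported by summary()
r2_summary <- summary(fm)$r.squared

# All three entries should agree
c(cor = r2_cor, ss = r2_ss, summary = r2_summary)
```

The second form makes the opening question concrete: it is the proportion of the total variation in the outcome that the residuals do not account for.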
The total variability in \(y\) can be split into two parts: variation between groups and variation within groups.
The \(F\)-statistic is the ratio of the two.
\[F = \frac{\mbox{Between Group Variation}}{\mbox{Within Group Variation}}\]
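This ratio can be computed by hand from raw data. The sketch below assumes R's built-in `PlantGrowth` data (plant `weight` in three `group`s) as a stand-in, since the data behind the ANOVA tables in this section are not loaded here. It builds the between-group and within-group mean squares directly and compares their ratio with the value from `anova()`.

```r
# Stand-in example: built-in PlantGrowth data (weight by group)
y <- PlantGrowth$weight
g <- PlantGrowth$group

grand_mean <- mean(y)
group_means <- tapply(y, g, mean)
group_n <- tapply(y, g, length)

# Between-group variation: group means around the grand mean
ss_between <- sum(group_n * (group_means - grand_mean)^2)
df_between <- nlevels(g) - 1

# Within-group variation: observations around their own group mean
ss_within <- sum((y - group_means[as.character(g)])^2)
df_within <- length(y) - nlevels(g)

# F is the ratio of the two mean squares
F_by_hand <- (ss_between / df_between) / (ss_within / df_within)
F_anova <- anova(lm(weight ~ group, data = PlantGrowth))$`F value`[1]

c(by_hand = F_by_hand, anova = F_anova)
```

Each sum of squares is divided by its degrees of freedom before the ratio is taken, so \(F\) compares variation per degree of freedom rather than raw sums.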
## Analysis of Variance Table
##
## Response: Shift
##           Df Sum Sq Mean Sq F value   Pr(>F)
## Treatment  2 7.2245  3.6122  7.2894 0.004472 **
## Residuals 19 9.4153  0.4955
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Sum Sq: Variability accounted for by that part of the ANOVA
Mean Sq: Sum Sq / Df
F value: Mean Sq Treatment / Mean Sq Residual
Pr(>F): P-value for the F-test of that variable
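As a check on these definitions, the remaining columns of the ANOVA table can be rebuilt from `Df` and `Sum Sq` alone. This is a small sketch using the rounded values printed in the table. The variable names are illustrative, and `pf()` is base R's cumulative distribution function for the \(F\) distribution.

```r
# Values copied from the ANOVA table above (rounded as printed)
ss_treatment <- 7.2245
df_treatment <- 2
ss_residual <- 9.4153
df_residual <- 19

# Mean Sq = Sum Sq / Df
ms_treatment <- ss_treatment / df_treatment
ms_residual <- ss_residual / df_residual

# F value = Mean Sq Treatment / Mean Sq Residual
F_value <- ms_treatment / ms_residual

# Pr(>F) = upper-tail probability of the F distribution
p_value <- pf(F_value, df_treatment, df_residual, lower.tail = FALSE)

round(c(ms_treatment = ms_treatment, ms_residual = ms_residual,
        F_value = F_value, p_value = p_value), 4)
```

Because the inputs are rounded, the results should match the table only to about four significant digits.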
\[R^{2} = \frac{\mbox{Variation accounted for by group membership}}{\mbox{Total variation}}\]
\[R^{2} = \frac{\mbox{Sum Sq Group}}{\mbox{(Sum Sq Group + Sum Sq Residuals)}}\]
anova(fm_lm)$`Sum Sq`[1]/sum(anova(fm_lm)$`Sum Sq`)
## [1] 0.4341684
7.224492/(7.224492 + 9.415345)
## [1] 0.4341684
##
## Call:
## lm(formula = Shift ~ Treatment, data = JL)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -1.27857 -0.36125  0.03857  0.61147  1.06571
##
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept)   -0.30875    0.24888  -1.241  0.22988
## Treatmenteyes -1.24268    0.36433  -3.411  0.00293 **
## Treatmentknee -0.02696    0.36433  -0.074  0.94178
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.7039 on 19 degrees of freedom
## Multiple R-squared:  0.4342, Adjusted R-squared:  0.3746
## F-statistic: 7.289 on 2 and 19 DF,  p-value: 0.004472
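The last two lines of this output are linked: both the adjusted \(R^2\) and the \(F\)-statistic can be recovered from the multiple \(R^2\) and the degrees of freedom. The sketch below uses the standard textbook formulas, which are not stated in the output itself, with the values printed above. The sample size of 22 is inferred from the 19 residual degrees of freedom plus 3 estimated coefficients.

```r
# Values taken from the output above
r2 <- 0.4341684  # multiple R-squared
n <- 22          # observations: 19 residual df + 3 estimated coefficients
k <- 2           # predictors (two treatment dummy variables)

# Adjusted R-squared penalizes R-squared for the number of predictors
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - k - 1)

# F compares explained variation per predictor with
# unexplained variation per residual degree of freedom
F_from_r2 <- (r2 / k) / ((1 - r2) / (n - k - 1))

round(c(adj_r2 = adj_r2, F_from_r2 = F_from_r2), 4)
```

The results should agree with the adjusted \(R^2\) and \(F\)-statistic reported by `summary()` up to rounding.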